Empirically Grounding the Evaluation of Creative Systems: Incorporating Interaction Design

Author

  • Oliver Bown
Abstract

In this paper I argue that the evaluation of artificial creative systems in the direct form currently practiced is not in itself empirically well-grounded, hindering the potential for incremental development in the field. I propose an approach to evaluation that is grounded in thinking about interaction design, and inspired by an anthropological understanding of human creative behaviour. This requires looking at interactions between systems and humans using a richer cultural model of creativity, and the application of empirically better-grounded methodological tools that view artificial creative systems as situated in cultural contexts. The applicability of the concepts ‘usability’ and ‘user experience’ is considered for creative systems evaluation, and existing evaluation frameworks including Colton’s creativity tripod and Ritchie’s 18 criteria are reviewed from this perspective.

Introduction: Evaluation, Creativity and Empiricism

This paper is concerned with the evaluation of creative systems, specifically in the area of artistic creativity (not to be confused with evaluation by creative systems). Whilst AI researchers in other application domains are able to observe and measure incremental improvements in their algorithms, computational creativity researchers are burdened by the inherent ambiguity in the field regarding whether algorithm or system X is better than algorithm or system Y. Incremental developments in the field are also relatively obscure to the outsider: the figurative artworks created by Harold Cohen’s celebrated automated artist, AARON, in the 1980s [1] look like the work of a competent and creative artist. As far as the artwork itself is concerned, this would appear to be as good as it gets – problem solved. But most in the field believe that we are only just beginning to develop good creative systems.
Such appearances foster confusion about where we stand in the development of significant artistic creativity in computers, between a far-off goal on the one hand, and a solved problem on the other.

[1] See AARON’s online biography at http://www.usask.ca/art/ digital culture/wiebe/moving.html.

Cardoso, Veale, and Wiggins (2009) characterise the field as taking a pragmatic, demonstrative approach to computational creativity practice, “which sees the construction of working models as the most convincing way to drive home a point” (Cardoso, Veale, and Wiggins, 2009, p. 19). This tradition has kept the focus on innovation, distinguishing it from more theoretical studies of creativity. Nevertheless the discussions and demonstrations that surround such an approach depend on a firm relationship between empirical observations and what we claim about systems. Hence, understandably, a significant portion of the literature in the field focuses on the ‘necessary theoretical distraction’ of how to go about evaluating systems.

Wiggins’ notion of evaluation (Wiggins, 2006), widely adopted in the field, requires that a system performs tasks in a way that would be deemed creative if performed by a human. But whilst simple to state, the task of concretely drawing such a conclusion about a given system maintains an opaque and vexing relationship to the various forms of empirical observations available to us.

In light of these issues, the purpose of this paper is to examine the empirical grounding underlying the evaluation of systems. Empirical grounding is defined as the practice of anchoring theoretical terms to scientifically measurable events, and is necessary for the “effectiveness of the application of knowledge” (Goldkuhl, 2004), which is essential for transforming discussions about system designs and methods into incremental scientific progress.
I argue that whilst the essential incompatibility between evaluation in computational creativity and the objective nature of optimisation found in AI may have been acknowledged from the outset, there remains a gap that has not yet been filled by a positive theory of evaluation in computational creativity. Further to this, I propose that the standard model of creativity in art, derived largely from Boden’s concepts, has not provided a suitable framework for thinking about where and how the evaluation of creativity applies in human artistic behaviour.

To address this, it is proposed that a human-centred view, specifically the use of design-based approaches such as interaction design, can give computational creativity a thorough empirical grounding. An interaction design approach can be applied easily to existing work in computational creativity, viewing the understanding and measurement of system behaviours in terms of their interaction with human ‘users’. It offers a practical route to bringing a much-needed human and social dimension to studies of creative systems without rejecting aspirations towards autonomy in computational creativity software.

The Soft Side of Computational Creativity

The adjectives ‘hard’ and ‘soft’ have been used, controversially, to refer to different areas of scientific enquiry (as a precaution, they remain in quotes throughout this paper!). Diamond (1987) explains that some “areas are given the highly flattering name of hard science, because they use the firm evidence that controlled experiments and highly accurate measurements can provide,” whereas “soft sciences, as they’re pejoratively termed, are more difficult to study for obvious reasons... You can’t start... and stop [experiments] whenever you choose. You can’t control all the variables; perhaps you can’t control any variable. You may even find it hard to decide what a variable is” (Diamond, 1987, p. 35).
Although many theoreticians such as Diamond reject the tone of the terms (here Diamond is arguing that soft sciences are in fact harder than hard sciences), the definitions given here usefully describe a continuum of what he understands as ‘degrees of operationalisation’. Whilst the terms may connote ‘tough’ and ‘weedy’ respectively, they also connote ‘rigid’ and well-defined levels of operationalisation versus more ‘flexible’ and loosely-defined levels of operationalisation. This distinction remains useful. A key point is that there are appropriate ways to deal with ‘soft’ concepts, foremost of which is to acknowledge them as such in order to apply suitable methods. A popular perception is that ‘soft’ sciences harden as their theory and practice coevolve, with psychology and sociology given as typical examples (Nature, 2005). But doing quality ‘soft science’ would appear to be the first step towards this ambition. Computational creativity necessarily deals with both sorts of concepts, and researchers must therefore know how to work across this spectrum.

I discuss as an example Colton’s ‘creativity tripod’ (2008). Colton proposes to include in his formulation of evaluation a set of internal properties of systems, due to the limited information available when using only the end products of an automated creative process to evaluate that process (as advocated by Ritchie (2007)). He proposes that we look inside the system itself in order to gain a fuller description of the system’s processes along with its products, and thus make a more informed decision about the creativity of the system. This, he argues, is more in line with how we evaluate human creativity: “A classic example... is Duchamp’s displaying of a urinal as a piece of art. In situations like these, consumers are really celebrating the creativity of the artist rather than the value of the artefact” (Colton, 2008, p. 15).

Colton suggests breaking down creativity into three components – a ‘creativity tripod’ of skill, appreciation and imagination – that can be sought in creative systems. He defines each of these as necessary conditions for the identification of creativity, and proposes that creativity evaluation could be built around an analysis of these properties. He performs such an analysis of his own systems, HR and The Painting Fool, and identifies the existence of each component in both systems (although he clarifies that they do not occur simultaneously in the same version of the Painting Fool system).

In Colton’s analysis, skill, appreciation and imagination are not formalised, and are treated as intuitive ideas taken in the manner of Wiggins’ ‘creativity as recognised by a human’ criterion. Accordingly, Colton’s application of the terms is impressionistic. For example, he says of The Painting Fool’s imagination that “we wrote a scene generation module that uses an evolutionary approach to build scenes containing objects of a similar nature, such as city skylines and flower arrangements” (Colton, 2008, p. 21). From this, the reader has little hope of determining whether the ‘imagination’ criterion has been satisfied, let alone what the subcriteria are for imagination. A further problem is that, in empirical terms, the expected order of knowledge discovery has clearly been put in reverse: imagination has been defined first as a kind of internal scene generation process, then implemented into the system, the conclusion being drawn that the system contains imagination. This abandons the critical step of enquiry into whether, having defined imagination as such and implemented it accordingly, this is actually a sufficient definition of imagination. Under these circumstances, the concepts skill, appreciation and imagination cannot be distinguished from trivial pseudo-versions of themselves.
Accordingly, reduction to triviality provides an easy rebuttal to such claims, and this has been performed by Ventura on Colton’s criteria (Ventura, 2008). Ventura presents a clearly trivial, unanimously uncreative computer program, and applies a similar analysis to that performed originally by Colton, concluding that the mock system has skill, appreciation and imagination. If Ventura’s system has these features, and they are sufficient for the attribution of creativity, then we must either accept the system as creative or reject the criteria as they currently stand.

Can such vague concepts be used at all, or should they be dropped altogether if they can’t be precisely formalised? I prefer to support both Colton’s initial premise – that an understanding of the inner workings of systems is as necessary to evaluating creativity as the outputs the system produces – and his identification of skill, appreciation and imagination as critical features of advanced creative systems. They are things that we would expect to see well implemented in our finest systems and there is nothing wrong with making this intuitive step. But unfortunately they are clumsy terms, and as Ventura’s analysis demonstrates, don’t look like hopeful performers at a formal level. In Diamond’s terms, they are far from being effectively operationalised, and they may never be operationalised, because in the process we would reasonably expect to devise concepts that are far removed from folk terminology, just as physicists and neuroscientists have done.

A more generic scientific strategy for how to work with both rigid and flexible objects alike comes from the definitive hard scientist Richard Feynman (1974), who makes a simple appeal to what he describes as an unspoken law of science, “a kind of utter honesty–a kind of leaning over backwards” to face the problem of “how not to fool ourselves” (Feynman, 1974).
He draws an analogy between forms of habitual scientific practice and the famed cargo cults of the South Pacific, who carved wooden headphones and bamboo antennae in the hope of attracting cargo planes to land, imitating the troops they had seen during WWII. He calls upon scientists across disciplines to ask themselves, Am I making symbolic wooden headphones or real working headphones?

In the spirit of Feynman’s call to ‘utter honesty’, an overlooked first step is to acknowledge that these terms, given our current knowledge, are extremely flexible and far-from-operationalised, which places very different demands on how we address and manipulate them as concepts. Their treatment is implicitly argument-based, meaning that no neat proof or direct basis in evidence is available to us. This makes for a very messy equivalence to the process of checking the steps of a proof or repeating a simulation experiment, with each step containing unknowns and vagaries: flexible rather than rigid science. Computational creativity needs to learn to work with vague concepts that are not easily subject to formal treatment.

Other examples of slips into the space of soft science that are likely to occur in computational creativity discourse include describing a system as ‘doing something on its own’ when discussing the autonomy of systems, but remaining imprecise about what the ‘it’ and the ‘doing’ specify (e.g., to say a program composes a piece of music ‘on its own’ requires quite a detailed analysis of the sequence of events leading to the specific configuration of musical content), and cases of comparing exploratory and transformational creativity in an interpretive manner (e.g., to classify any historical creative act as transformational requires the imposition of our own chosen categories onto incomplete historical data) (see Ritchie, 2006, for an interesting discussion). For this reason ‘soft sciences’, such as social anthropology, subject the use of language to great scrutiny.
The meaning of terms that cannot easily be made measurable or mathematically manipulable is instead treated with an acknowledgement of its fragility. As a part of their data gathering, anthropologists immerse themselves in cultural situations in order to be able to fully understand and successfully interpret what they observe. Immersion is necessary in order to expose the cultural content of these situations, which is not directly accessible through ‘hard science’ methods such as surveys, lab tests or recordings. For example, the difference between a twitch of the eye, a wink, a fake wink, a parodied wink, a burlesque of a parodied wink, and so on, might only be fully accessible to someone who has an intimate understanding of the sociocultural context in which the act occurs (Geertz, 1973). Misinterpretation of such acts is a clear source of error in the development of theory. In the 1970s, borrowing from philosopher Gilbert Ryle, anthropologist Clifford Geertz (Geertz, 1973) developed these practices into a method of ‘thick description’ that gave new impetus to, and validation of, the interpretative (‘soft’) side of anthropology as a science.

Such thinking is more relevant to computational creativity than it may appear. The empirical material underlying Wiggins’ ‘creativity as recognised by a human’ criterion is in the first instance anthropological rather than psychological, revolving around interpretations of culturally-situated human behaviour: in particular, it requires that we establish a shared understanding of what ‘creative’ means. Geertz’ advice on grounding methodology is that “if you want to understand what a science is, you should look in the first instance not at its theories or findings, and certainly not at what its apologists say about it; you should look at what the practitioners of it do” (Geertz, 1973, p. 5). This is a call to work the science’s methods around the data and practices that are practically available.
This may be helpful given what computational creativity practitioners do. Cardoso, Veale and Wiggins’ characterisation of computational creativity practice as the construction of “working models as the most convincing way to drive home a point” (Cardoso, Veale, and Wiggins, 2009, p. 19) breaks down into two parts: the engineering excellence to create advanced creative systems, and the analysis of human social interaction in creative contexts that will be used to round off the argument. Thus a necessary direction for computational creativity is to fuse excellence in the ‘hard science’ area of algorithms and the ‘soft science’ of understanding human social interaction. The terms skill, appreciation and imagination are things that we should be seeking to better define through (‘soft’) computational creativity research, and cannot at the same time be used as the basis for a (‘hard’) test for creativity.

Characterising Artistic Creativity Using Generative and Adaptive Creativity

Value or utility is included in the vast majority of definitions of creativity (most notably Boden, 1990), and is critical to many applications of creativity research, such as improving organisational creativity and building creative cities. But non-cognitive processes such as biological evolution are also viewed as creative. Here, value cannot have the same meaning as it does in the context of human cognition-based creativity, because there is no agent to do the valuing. And yet this difference has not been explored in any depth. The application of theoretical concepts has tended to focus on Boden’s (1990) two key distinctions in her analysis of creativity: between personal and historical creativity as indications of scope; and between combinatorial, exploratory and transformational creativity as forms of creative succession.
From this point of view, creativity is tightly bound to individual human goals, and is primarily conceived of as a cognitive process that is used to discover new things of value. This lack of attention to the variable nature of value in creativity causes confusion and has led to a poor empirical grounding for evaluation in computational creativity, precisely because much creativity occurs outside of the process of human creative cognition (in the narrower sense given above). A distinction based on different relations to value has not been taken up by the community. I draw on a distinction (Bown, 2012) between ‘generative’ and ‘adaptive’ creativity, and argue that this distinction clarifies and resolves the confusion about how value is manifest in the arts.

In Bown (2012) I propose a distinction between two forms of creativity based on their relationship to value: generative and adaptive creativity. Generative creativity is defined with a very broad scope: it occurs wherever new types of things come into existence. It does not require cognition: non-human processes such as biological evolution are capable of creating new types of things, and, I argue, there are also examples of human activity in which things emerge ‘autopoietically’ without being planned or conceived of by individual humans. The role of generative creativity in art will be discussed below. Generative creativity offers an expanded view of creativity in which the production of new types of thing is the sole criterion for creativity to have occurred, and the process by which those things are produced – whether by deities, human minds or autopoietic processes – is secondary. In human creativity, this liberates us from the possibly misleading premise that the ‘creative mind’ is necessary and sufficient for the ‘act of creation’. A framework that distinguishes between those entities can properly address the issue of when and how human thinking is associated with new things coming into existence.
Adaptive creativity, on the other hand, is that in which something is created by an intelligent agent in response to a need or opportunity. The distinguishing feature here is that of value or benefit – generative creativity is ‘value free’. In adaptive creativity, the agent doing the creation stands to benefit from the creative act: a link must exist between the creative agent and the beneficial return of the creative act in order for adaptive creativity to have occurred. Uncontroversial examples include solving everyday problems, such as using a coat-hanger to retrieve something from behind a wardrobe. Adaptive creativity is understood as requiring certain cognitive abilities such as mental representation, whereas generative creativity is completely blind, as in biological evolution. Generative and adaptive creativity are not extremes at either end of a continuum, but distinct and mutually exclusive categories – either there was a preceding purpose or there was not. However, the appearance of new things may be the sum of different episodes of generative and adaptive creativity.

Given these terms, I argue that the existing notion of the evaluation of creative systems is entirely – indeed inherently – geared towards adaptive creativity, and is unable to accommodate generative creativity at all. Adaptive creativity alone is compatible with computational creativity’s AI legacy, which favours an optimisation or search approach to discovering valuable artefacts. This is not without powerful applications. Evolutionary optimisation regularly discovers surprising designs in response to engineering problems. Thaler’s “Creativity Machine”, for example, was used to discover novel toothbrush designs using a relatively traditional optimisation approach involving a clear objective function (Plotkin, 2009). It is only generative creativity that is incompatible with optimisation.
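
The link between adaptive creativity and optimisation can be made concrete with a minimal sketch. The code below is an illustration only and is not Thaler's method or any system discussed in this paper: the two-parameter "design", the `toy_objective` function and its target values are hypothetical, chosen so that the search has a known answer. What it shows is that evaluation is straightforward only because the value function is fixed before the search begins.

```python
def hill_climb(objective, start, neighbours, max_steps=1000):
    """Steepest-ascent search: move to the best neighbour while it improves a FIXED objective."""
    current, current_score = start, objective(start)
    for _ in range(max_steps):
        candidate = max(neighbours(current), key=objective)
        candidate_score = objective(candidate)
        if candidate_score <= current_score:
            break  # no neighbour is better: a local optimum under this objective
        current, current_score = candidate, candidate_score
    return current, current_score


# Hypothetical toy problem: a "design" is a pair of integer parameters, and the
# objective (an assumption for illustration) peaks at the design (11, 9).
def toy_objective(design):
    a, b = design
    return -((a - 11) ** 2) - ((b - 9) ** 2)


def toy_neighbours(design):
    a, b = design
    return [(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)]


best, score = hill_climb(toy_objective, (0, 0), toy_neighbours)
print(best, score)
```

In this framing the question "how good is the output?" has a definite answer, namely the objective score. A generatively creative process has no counterpart to `toy_objective`, which is why it cannot be evaluated in the same way.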
Adaptive and Generative Creativity in the Arts

For the purpose of evaluating creative systems, it has been considered reasonable to assume that we can treat artistic domains entirely in terms of adaptive creativity, and that the act of creating artworks is an adaptively creative act. Accordingly one can view the production of an artwork as an optimisation or search problem. This simplification is built into the premise of an agent designed to evaluate its output in order to find good solutions. For such an agent to incorporate generative creativity into its behaviour would mean that the value of its output was indeterminate and evaluation would be frustrated.

But evidence suggests that this view of art does not hold when one considers its social functions. I will focus on music for the purpose of this discussion, and take what I believe is an uncontroversial understanding of music insofar as sociologists of music are concerned. Hargreaves and North (1999) identify three principal social functions for music: self-identity, interpersonal relationships and mood. These in turn, they argue, shape musical preference and practice. For example, “research on the sociocultural functions of music suggests that it provides a means of defining ethnic identity” (Hargreaves and North, 1999, p. 79). The evidence they gather shows the perceived aesthetic value of music not to be determined purely by exposure to a corpus or ‘inspiring set’, but also by a set of existing social relationships.

More recent research in experimental psychology reveals an increasingly complex story behind how we give value to creative artefacts. Salganik, Dodds, and Watts (2006), for example, show that music ratings are directly influenced by one’s perception of how others rated the music, not just in the long term but at the moment of making the evaluation.
Newman and Bloom (2012) examine the underlying causes of the attachment of value to originals rather than copies, finding, amongst other things, that the value given to an original is associated with its physical contact with the artist. Both studies suggest a form of winner-takes-all process whereby success begets further success. Such phenomena place limits on the importance of the creative content in evaluation. Admittedly artistic success is not the same as artistic creativity, but the overlap is great enough, in any practical sense of evaluating creativity, to carry the argument from one domain to the other.

Csikszentmihalyi’s (1999) domain-individual-field theory has long held that individuals influence domains and alter fields, but such observations have on the whole only been acknowledged, not actually applied in computational creativity. Coming close, Charnley, Pease, and Colton (2012) present ‘framing’ as a way to deal with the process of adding additional information that may influence the value of a creative output. According to the idea of framing, I might provide information alongside an artwork, such as an exhibition catalogue entry, that influences its perception. In its simple form framing would embellish an artwork, perhaps explaining some hidden symbolism behind the materials used. But in this sense it is simply a part of the system output along with the artwork. By comparison, verbal statements, and other social actions, can have effects with respect to value that are categorically different from this, for example by provoking people to alter their perception of value in general. Framing takes steps towards the idea that value can be manipulated, even ‘created’, but continues to assume a fixed frame of reference.
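
As a rough illustration of the success-begets-success dynamic referred to in this section, the following toy simulation compares listeners who choose independently with listeners whose choices are weighted by earlier downloads. It is a sketch under stated assumptions, not the experimental design of Salganik, Dodds, and Watts: the function names, the equal baseline appeal of every song and the `influence` parameter are all hypothetical.

```python
import random


def simulate_downloads(n_songs=10, n_listeners=2000, influence=0.0, seed=1):
    """Each listener picks one song; weights grow with prior downloads when influence > 0."""
    rng = random.Random(seed)
    downloads = [0] * n_songs
    for _ in range(n_listeners):
        # Every song has the same baseline appeal (1.0), so any inequality that
        # emerges comes from chance and social influence, not from content.
        weights = [1.0 + influence * d for d in downloads]
        chosen = rng.choices(range(n_songs), weights=weights, k=1)[0]
        downloads[chosen] += 1
    return downloads


def top_share(downloads):
    """Fraction of all downloads captured by the most popular song."""
    return max(downloads) / sum(downloads)


independent = simulate_downloads(influence=0.0)
social = simulate_downloads(influence=1.0)
print("independent:", round(top_share(independent), 3))
print("social:     ", round(top_share(social), 3))
```

Because the songs are identical by construction, a large top share in the socially influenced condition cannot be attributed to the content of the winning song. Under this kind of cumulative-advantage rule the top share tends to be larger and to vary more from run to run than in the independent condition, which is the sense in which such processes limit the role of creative content in evaluation.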
Taking these additional processes into account, when an individual produces an artwork, some amount of the value of that artwork may have already been determined by factors that are not controlled by the individual, or be later determined by factors that are unrelated to the content of the work. The creativity invested in the creation is not entirely the product of the individual, whose artistic behaviour may be more associated with habit and enculturation than discovery, but is imposed upon the individual through their context and life history. The anthropological notion of the ‘dividual’, or ‘porous subject’ (Smith, 2012) has been used to capture this idea of a person as being composed of cultural influences, indicating their ongoing permeability to influence. According to this view, the flux of influence between individuals may have an equivalence to the interaction between submodules within a single brain, meaning that isolating individuals as units of study is no better a division than focusing on couples, tuples, larger groups or cognitive submodules.

Given this understanding of individual human behaviour in relation to culture in general, and the arts in particular, computational creativity can be seen to place too much emphasis on the idea of individuals being independent creators. From this alternative point of view it is argued that artistic behaviour has a significant generative creativity element by which new forms ‘spring up’, not because individuals think of them, but through a jumble of social interaction. Such emergent forms may have structural properties related to the process that produced them, but they were not made with purpose. By analogy, consider a classic debate about adaptationism and form in evolutionary theory: the shape of a snail shell, as described in Thompson’s On Growth and Form (Thompson, 1992), comes about through the process of evolutionary adaptation. But this is not purely a product of the selective pressures acting on the species.
It results from an interaction between selective pressures and naturally-occurring structure. Likewise, human acts of creation are constrained by structural factors that guide the creator, augmenting agency. The notion that a system possesses a level of creativity is riddled with complexity, owing to the fact that creativity is as much something that is enacted upon individual systems as enacted by them.

In computational creativity, this means that the goal of evaluating virtual autonomous artists is not empirically well-grounded when performed in isolation. Empirical grounding requires a strong coherence between our theories and practices, and the things we can observe. In the following section, I will argue that an interaction design approach delivers this coherence, bringing together system development with a thorough understanding of the culturally-situated human. I will suggest that interaction design shouldn’t be viewed merely as an add-on or a form of research used only at the application stage, but that it has a central role to play in improving methodology in computational creativity.

Towards Empirical Grounding

To reiterate the argument so far, empirical grounding is defined as the process of anchoring theoretical terms to scientifically measurable events. Computational creativity characteristically employs a makers’ approach to innovating new ideas and building better systems, but the idea of asking how creative these systems are is not empirically well-grounded. Then what can we ask? I have examined the need simply to elaborate on terms and concepts during the process of evaluation, adopting appropriate ‘soft science’ ways of thinking alongside the existing engineering mindset, but although a well-grounded approach needs to take this into account, it does not provide a grounding itself.

Two research methodologies already well integrated into computational creativity offer a basis for empirically well-grounded research.
These are interaction design and multiagent systems modelling. In both cases the imbalance between generative creativity and adaptive creativity is addressed. In the interaction design approach, creative systems are treated as objects that are inevitably situated in interaction with humans. The nature of that interaction, including its efficacy, is treated as the primary concern. Here the empirical grounding comes from the fact that properties of interaction and experience related to the analysis of usability and user experience can be observed and measured, whilst existing notions of creativity evaluation can easily be incorporated into theories of interaction design. This need not be limited to a creative professional working with a piece of creative software, but could apply to any form of interaction between person and creative system. In the modelling approach, artificial creative systems are treated as models of human creative systems. For the reasons discussed above, it does not suffice to test the success of model systems by attempting to evaluate their output, but many other observable and measurable aspects of human creativity can be studied. Multi-agent models of social networks are particularly appealing in this regard because generatively creative processes fall inside the scope of the system being studied, alleviating the tension between adaptive and generative creativity. In this paper I only elaborate on the interaction design approach, firstly because it is more immediately applicable to computational creativity practice, and secondly because much of what can be said about empirically grounded modelling is well-known to researchers.


Publication date: 2014